The past research on the state complexity of operations on regular languages is examined, and a 
new approach based on an old method (derivatives of regular expressions) is presented. Since state 
complexity is a property of a language, it is appropriate to define it in formal-language terms as the 
number of distinct quotients of the language, and to call it "quotient complexity". The problem of 
finding the quotient complexity of a language f(K,L) is considered, where K and L are regular 
languages and f is a regular operation, for example, union or concatenation. Since quotients can 
be represented by derivatives, one can find a formula for the typical quotient of f(K, L) in terms of 
the quotients of K and L. To obtain an upper bound on the number of quotients of f(K, L) all one 
has to do is count how many such quotients are possible, and this makes automaton constructions 
unnecessary. The advantages of this point of view are illustrated by many examples. Moreover, 
new general observations are presented to help in the estimation of the upper bounds on quotient 
complexity of regular operations. 

1 Introduction 

It is assumed that the reader is familiar with the basic concepts of regular languages and finite automata, 
as described in many textbooks. General background material can be found in Dominique Perrin's [24] 
(1990) and Sheng Yu's [29] (1997) handbook articles; the latter has an introduction to state complexity. 
A more detailed treatment of state complexity can be found in Sheng Yu's survey [30]. The present paper 
concentrates on the complexity of basic operations on regular languages. Other aspects of complexity 
of regular languages and finite automata are discussed in El 13 El El [T31 23 |27l [28J ; this list is not 
exhaustive, but it should give the reader a good idea of the scope of the work on this topic. 

2 State complexity or quotient complexity? 

The English term state complexity of a regular language seems to have been introduced by Birget [1] in 
1991, and is now in common use. It is defined as the number of states in the minimal deterministic finite 
automaton (DFA) accepting the language [30]. There were much earlier studies of this topic, but the 
term "state complexity" was not used. For example, in 1963 Lupanov [19] showed that the bound $2^n$ 
is tight for the conversion of nondeterministic finite automata (NFAs) to DFAs, and he used the term 
slozhnost' avtomatov, meaning complexity of automata representing the same set of words. The case of 
languages over a one-letter alphabet was studied in 1964 by Lyubich [20]. Lupanov's result is almost 
unknown in the English-language literature, and is often attributed to the 1971 paper by Moore [22]. 

*This research was supported by the Natural Sciences and Engineering Research Council of Canada under grant no. 
OGP0000871. 
In 1970, Maslov [21] studied the complexity of basic operations on regular languages, and stated without 
proof some tight bounds for these operations. In the introduction to his paper he states: 

An important characteristic of the complexity of these sets [of words] is the number of states 
of the minimal representing automaton.$^2$ 

In 1981 Leiss [18] referred to (deterministic) complexity of languages. Some additional references to 
early works related to this topic can be found in [10, 30], for example. 

A language is a subset of the free monoid $\Sigma^*$ generated by a finite alphabet $\Sigma$. If state complexity 
is a property of a language, then why is it defined in terms of a completely different object, namely an 
automaton? Admittedly, regular languages and finite automata are closely related, but there is a more 
natural way to define this complexity of languages, as is shown below. 

The left quotient, or simply quotient, of a language $L$ by a word $w$ is defined as the language 

$$w^{-1}L = \{x \in \Sigma^* \mid wx \in L\}.$$

The quotient complexity of $L$ is the number of distinct languages that are quotients of $L$, and will be 
denoted by $\kappa(L)$ (kappa for both kwotient and komplexity). Quotient complexity is defined for any 
language, and so may be finite or infinite. 

Since languages are sets, it is natural to define set operations on them. The following are typical set 
operations: complement ($\overline{L} = \Sigma^* \setminus L$), union ($K \cup L$), intersection ($K \cap L$), difference ($K \setminus L$), and symmetric 
difference ($K \oplus L$). A general boolean operation with two arguments is denoted by $K \circ L$. Since 
languages are also subsets of a monoid, it is also natural to define product, usually called concatenation 
($K \cdot L = \{w \in \Sigma^* \mid w = uv,\ u \in K,\ v \in L\}$), star ($K^* = \bigcup_{i \ge 0} K^i$), and positive closure ($K^+ = \bigcup_{i \ge 1} K^i$). 

The operations union, product and star are called rational or regular. Rational (or regular) languages 
over $\Sigma$ are those languages that can be obtained from the set $\{\emptyset, \{\varepsilon\}\} \cup \{\{a\} \mid a \in \Sigma\}$ of basic languages, 
where $\varepsilon$ is the empty word (or, equivalently, from another basis, such as the finite languages over $\Sigma$), using 
a finite number of rational operations. Since it is cumbersome to describe regular languages as sets 
(for example, one has to write $L = (\{\varepsilon\} \cup \{a\})^* \cdot \{b\}$), one normally switches to regular (or rational) 
expressions. These are the terms of the free algebra over the set $\Sigma \cup \{\emptyset, \varepsilon\}$ with function symbols$^3$ $\cup$, $\cdot$, 
and $^*$ [24]. For the example above, one writes $E = (\varepsilon \cup a)^* \cdot b$. The mapping $\mathcal{L}$ from this free algebra 
onto the algebra of regular languages is defined inductively as follows: 

$$\mathcal{L}(\emptyset) = \emptyset, \quad \mathcal{L}(\varepsilon) = \{\varepsilon\}, \quad \mathcal{L}(a) = \{a\},$$

$$\mathcal{L}(E \cup F) = \mathcal{L}(E) \cup \mathcal{L}(F), \quad \mathcal{L}(E \cdot F) = \mathcal{L}(E) \cdot \mathcal{L}(F), \quad \mathcal{L}(E^*) = (\mathcal{L}(E))^*,$$

where $E$ and $F$ are regular expressions. The product symbol $\cdot$ is usually dropped, and languages are 
denoted by expressions without further mention of the mapping $\mathcal{L}$. Since regular languages are closed 
under complementation, complementation is treated here as a regular operator. 

Because regular languages are defined by regular expressions, it is natural to use regular expressions 
also to represent their quotients; these expressions are their derivatives [4]. First, the $\varepsilon$-function of a 
regular expression $L$, denoted by $L^\varepsilon$, is defined as follows: 

$$a^\varepsilon = \begin{cases} \emptyset, & \text{if } a = \emptyset, \text{ or } a \in \Sigma; \\ \varepsilon, & \text{if } a = \varepsilon. \end{cases} \tag{1}$$

$$(\overline{L})^\varepsilon = \begin{cases} \emptyset, & \text{if } L^\varepsilon = \varepsilon; \\ \varepsilon, & \text{if } L^\varepsilon = \emptyset. \end{cases} \tag{2}$$

$$(K \cup L)^\varepsilon = K^\varepsilon \cup L^\varepsilon, \quad (KL)^\varepsilon = K^\varepsilon \cap L^\varepsilon, \quad (L^*)^\varepsilon = \varepsilon. \tag{3}$$

One verifies that $\mathcal{L}(L^\varepsilon) = \{\varepsilon\}$ if $\varepsilon \in L$, and $\mathcal{L}(L^\varepsilon) = \emptyset$ otherwise. 

$^2$ The emphasis is mine. 
$^3$ The symbol + is used instead of $\cup$ in [24]. 

The derivative by a letter $a \in \Sigma$ of a regular expression $L$ is denoted by $L_a$ and defined by structural 
induction: 

$$b_a = \begin{cases} \emptyset, & \text{if } b \in \{\emptyset, \varepsilon\}, \text{ or } b \in \Sigma \text{ and } b \neq a; \\ \varepsilon, & \text{if } b = a. \end{cases} \tag{4}$$

$$(\overline{L})_a = \overline{L_a}, \quad (K \cup L)_a = K_a \cup L_a, \quad (KL)_a = K_a L \cup K^\varepsilon L_a, \quad (L^*)_a = L_a L^*. \tag{5}$$

The derivative by a word $w \in \Sigma^*$ of a regular expression $L$ is denoted by $L_w$ and defined by induction 
on the length of $w$: 

$$L_\varepsilon = L, \quad L_w = L_a \text{ if } w = a \in \Sigma, \quad L_{wa} = (L_w)_a. \tag{6}$$

A derivative $L_w$ is accepting if $\varepsilon \in L_w$; otherwise it is rejecting. 

One can verify by structural induction that $\mathcal{L}(L_a) = a^{-1}L$, for all $a \in \Sigma$, and then by induction on 
the length of $w$ that, for all $w \in \Sigma^*$, 

$$\mathcal{L}(L_w) = w^{-1}L. \tag{7}$$

Thus every derivative represents a unique quotient of $L$, but there may be many derivatives representing 
the same quotient. 

Two regular expressions are similar [3] if one can be obtained from the other using the following 
rules: 

$$L \cup L = L, \quad K \cup L = L \cup K, \quad K \cup (L \cup M) = (K \cup L) \cup M, \tag{8}$$

$$L \cup \emptyset = L, \quad \emptyset L = L\emptyset = \emptyset, \quad \varepsilon L = L\varepsilon = L. \tag{9}$$

Upper bounds on the number of dissimilar derivatives, and hence on the quotient complexity, were 
derived in [3, 4]: If $m$ and $n$ are the quotient complexities of $K$ and $L$, respectively, then 

$$\kappa(\overline{L}) = \kappa(L), \quad \kappa(K \cup L) \le mn, \quad \kappa(KL) \le m2^n, \quad \kappa(L^*) \le 2^n - 1. \tag{10}$$

This immediately implies that the number of derivatives, and hence the number of quotients, of a regular 
language is finite. 
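
The derivative rules (4) and (5), together with the similarity rules (8) and (9), can be tried out mechanically. The following is a minimal sketch, not the construction used in [3, 4]: the tuple encoding of expressions and the helper names `union`, `cat`, `star`, `nullable`, `deriv` and `derivatives` are assumptions made for this illustration, and complement is left out to keep it short. The constructors `union` and `cat` apply rules (8) and (9) on the fly, so that similar expressions get the same representation.

```python
# Regular expressions as nested tuples (an assumed encoding, for illustration).
EMPTY, EPS = ("empty",), ("eps",)

def sym(a):
    return ("sym", a)

def union(x, y):
    """Union modulo rules (8) and L ∪ ∅ = L: a flat set of distinct terms."""
    terms = set()
    for t in (x, y):
        terms |= t[1] if t[0] == "union" else {t}
    terms.discard(EMPTY)
    if not terms:
        return EMPTY
    if len(terms) == 1:
        return next(iter(terms))
    return ("union", frozenset(terms))

def cat(x, y):
    """Product modulo rule (9): ∅ annihilates, ε is the identity."""
    if EMPTY in (x, y):
        return EMPTY
    if x == EPS:
        return y
    if y == EPS:
        return x
    return ("cat", x, y)

def star(x):
    return ("star", x)

def nullable(e):
    """True iff the ε-function of e is ε, following (1) and (3)."""
    tag = e[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "sym"):
        return False
    if tag == "union":
        return any(nullable(t) for t in e[1])
    return nullable(e[1]) and nullable(e[2])  # product

def deriv(e, a):
    """Derivative of e by the letter a, following (4) and (5)."""
    tag = e[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "sym":
        return EPS if e[1] == a else EMPTY
    if tag == "union":
        out = EMPTY
        for t in e[1]:
            out = union(out, deriv(t, a))
        return out
    if tag == "cat":
        left = cat(deriv(e[1], a), e[2])
        return union(left, deriv(e[2], a)) if nullable(e[1]) else left
    return cat(deriv(e[1], a), e)  # star

def derivatives(e, alphabet):
    """All dissimilar word derivatives of e, found by exploring letter by letter."""
    seen, todo = {e}, [e]
    while todo:
        d = todo.pop()
        for a in alphabet:
            nxt = deriv(d, a)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

E = cat(star(union(EPS, sym("a"))), sym("b"))  # the example (ε ∪ a)* b
print(len(derivatives(E, "ab")))  # prints 3
```

For this example the three derivatives found are the expression itself, $\varepsilon$ and $\emptyset$. In general the count of dissimilar derivatives is only an upper bound on the number of quotients, since dissimilar expressions may still denote the same language.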

It seems that the upper bounds in Equation (10), derived in 1962 [3], were the first "state complexity" 
bounds to be found for the regular operations. Since the aim at that time was simply to show that the 
number of quotients of a regular language is finite, the tightness of the bounds was not considered. 

Of course, the concepts above are related to the more commonly used ideas. A deterministic finite 
automaton, or simply automaton, is a tuple 

$$\mathcal{A} = (Q, \Sigma, \delta, q_0, F),$$

where $Q$ is a finite, non-empty set of states, $\Sigma$ is a finite, non-empty alphabet, $\delta : Q \times \Sigma \to Q$ is the 
transition function, $q_0 \in Q$ is the initial state, and $F \subseteq Q$ is the set of final states. The transition function 
is extended to $\delta : Q \times \Sigma^* \to Q$ as usual. A word $w$ is recognized (or accepted) by automaton $\mathcal{A}$ if 
$\delta(q_0, w) \in F$. It was proved by Nerode [23] that a language $L$ is recognizable by a finite automaton if 
and only if $L$ has a finite number of quotients. 

The quotient automaton of a regular language $L$ is $\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$, where $Q = \{w^{-1}L \mid w \in \Sigma^*\}$, 
$\delta(w^{-1}L, a) = (wa)^{-1}L$, $q_0 = \varepsilon^{-1}L = L$, and $F = \{w^{-1}L \mid \varepsilon \in w^{-1}L\}$. 

It should now be clear that the state complexity of a regular language $L$ is the number of states in its 
quotient automaton, i.e., the number $\kappa(L)$ of its quotients. This terminology change may seem trivial, 
but has some nontrivial consequences. 
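
Since the states of the quotient automaton are exactly the quotients, $\kappa(L)$ can be computed from any complete DFA for $L$ by discarding unreachable states and merging states that accept the same language. The sketch below does this with a simple partition refinement; the function name `quotient_complexity` and the dictionary encoding of the transition function are assumptions made for this illustration.

```python
def quotient_complexity(alphabet, delta, start, accepting):
    """Number of quotients of the language of a complete DFA.

    delta maps (state, letter) to a state; accepting is a set of states.
    """
    # Keep only the states reachable from the initial state.
    reach, todo = {start}, [start]
    while todo:
        q = todo.pop()
        for a in alphabet:
            r = delta[q, a]
            if r not in reach:
                reach.add(r)
                todo.append(r)
    # Refine the accepting/rejecting partition until it is stable.
    block = {q: int(q in accepting) for q in reach}
    while True:
        ids, refined = {}, {}
        for q in reach:
            sig = (block[q],) + tuple(block[delta[q, a]] for a in alphabet)
            refined[q] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = refined

# Words over {a, b} with an even number of a's, given by a redundant
# four-state DFA that counts a's modulo 4.
mod4 = {(i, c): (i + 1) % 4 if c == "a" else i for i in range(4) for c in "ab"}
print(quotient_complexity("ab", mod4, 0, {0, 2}))  # prints 2

# Words over {a, b} that end in ab.
ends_ab = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1,
           (1, "b"): 2, (2, "a"): 1, (2, "b"): 0}
print(quotient_complexity("ab", ends_ab, 0, {2}))  # prints 3
```

The first example shows why the count is a property of the language and not of the automaton: the four-state DFA has only two distinct quotients.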

For convenience, derivative notation will be used to represent quotients, in the same way as regular 
expressions are used to represent regular languages. 

By convention, $L^\varepsilon_w$ always means $(L_w)^\varepsilon$. 

Several proofs are omitted because of space limitations. 

3 Derivation of bounds using quotients 

Since languages over one-letter alphabets have very special properties, we usually assume that the alphabet 
has at least two letters. The complexity of operations on unary languages has been studied in [25, 30]. 

In the literature on state complexity, it is assumed that automata $\mathcal{A}$ and $\mathcal{B}$ accepting languages $K$ and 
$L$, respectively, are given. An assumption has to be made that the automata are "complete", i.e., that for 
each $q \in Q$ and $a \in \Sigma$, $\delta(q, a)$ is defined [32]. In particular, if a "dead" or "sink" state, which accepts 
no words, is present, one has to check that only one such state is included. Also, every state must be 
"useful" in the sense that it appears on some accepting path. 

Suppose that a bound on the state complexity of $f(K, L)$ is to be computed, where $f$ is some regular 
operation. In some cases a DFA accepting $f(K, L)$ is constructed directly (e.g., Theorems 2.3 and 3.1 
in [32]), or an NFA with multiple initial states is used, and then converted to a DFA by the subset 
construction (e.g., Theorem 4.1 in [32]). Sometimes an NFA with empty-word transitions is used and 
then converted to a DFA [28]. The constructed automata then have to be proved minimal. 

Much of this is unnecessary. If quotients are used, the problem of completeness does not arise, since 
all the quotients of a language are included. A quotient is either empty or "useful". If the empty quotient 
is present, then it appears only once. Since quotients are distinct languages, the set of quotients of a 
language is always minimal. To find an upper bound on the state complexity, instead of constructing an 
automaton for f(K, L), we need only find a regular expression for the typical quotient, and then do some 
counting. This is illustrated below for the basic regular operations. 

3.1 Bounds for basic operations 

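The following are some useful formulas for the derivatives of regular expressions: 
Theorem 1. If $K$ and $L$ are regular expressions, then 

$$(\overline{L})_w = \overline{L_w}, \tag{11}$$

$$(K \circ L)_w = K_w \circ L_w, \tag{12}$$

$$(KL)_w = K_w L \cup K^\varepsilon L_w \cup \Bigl(\bigcup_{w = uv,\ u, v \in \Sigma^+} K^\varepsilon_u L_v\Bigr). \tag{13}$$

For the Kleene star, $(L^*)_\varepsilon = \varepsilon \cup LL^*$, and for $w \in \Sigma^+$, 

$$(L^*)_w = \Bigl(\bigcup_{w = uv,\ u, v \in \Sigma^*} (L^*)^\varepsilon_u L_v\Bigr) L^*. \tag{14}$$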

Theorem 1 can be applied to obtain upper bounds on the complexity of operations. In Theorem 2 
below, the second part is a slight generalization of the bound in Theorem 4.3 of [32]. The third and fourth 
parts are reformulations of the bounds in Theorem 2.3 and 2.4, and of Theorem 3.1 of [32]: 

Theorem 2. For any languages $K$ and $L$ with $\kappa(K) = m$ and $\kappa(L) = n$: 

1. $\kappa(\overline{L}) = n$. 

2. $\kappa(K \circ L) \le mn$. 

3. Suppose $K$ has $k$ accepting quotients and $L$ has $l$ accepting quotients. 

(a) If $k = 0$ or $l = 0$, then $\kappa(KL) = 1$. 

(b) If $k, l > 0$ and $n = 1$, then $\kappa(KL) \le m - (k - 1)$. 

(c) If $k, l > 0$ and $n > 1$, then $\kappa(KL) \le m2^n - k2^{n-1}$. 

4. (a) If $n = 1$, then $\kappa(L^*) \le 2$. 

(b) If $n > 1$ and $L_\varepsilon$ is the only accepting quotient of $L$, then $\kappa(L^*) = n$. 

(c) If $n > 1$ and $L$ has $l > 0$ accepting quotients not equal to $L$, then $\kappa(L^*) \le 2^{n-1} + 2^{n-l-1}$. 

Proof: The first part is well-known, and the second follows from (12). 

For the product, if $k = 0$ or $l = 0$, then $KL = \emptyset$ and $\kappa(KL) = 1$. Thus assume that $k, l > 0$. If 
$n = 1$, then $L = \Sigma^*$ and $w \in K$ implies $(KL)_w = \Sigma^*$. Thus all $k$ accepting quotients of $K$ produce 
the one quotient $\Sigma^*$ in $KL$. For each rejecting quotient of $K$, we have two choices for the union of 
quotients of $L$ in (13): the empty union or $\Sigma^*$. If we choose the empty union, we can have at most $m - k$ 
quotients of $KL$. Choosing $\Sigma^*$ results in $(KL)_w = \Sigma^*$, which has been counted already. Altogether, 
there are at most $1 + m - k$ quotients of $KL$. Suppose now that $k, l > 0$ and $n > 1$. If $w \notin K$, then 
we can choose $K_w$ in $m - k$ ways, and the union of quotients of $L$ in $2^n$ ways. If $w \in K$, then we can 
choose $K_w$ in $k$ ways, and the set of quotients of $L$ in $2^{n-1}$ ways, since $L$ is then always present. Thus 
we have $(m - k)2^n + k2^{n-1}$. 

For the star, if $n = 1$, then $L = \emptyset$ or $L = \Sigma^*$. In the first case, $L^* = \varepsilon$, and $\kappa(L^*) = 2$; in the second 
case, $L^* = \Sigma^*$ and $\kappa(L^*) = 1$. Now suppose that $n > 1$; hence $L$ has at least one accepting quotient. If $L$ 
is the only accepting quotient of $L$, then $L^* = L$ and $\kappa(L^*) = \kappa(L)$. 

Now assume that $n > 1$ and $l > 0$. From (14), every quotient of $L^*$ by a non-empty word is a union 
of a subset of quotients of $L$, followed by $L^*$. Moreover, that union is non-empty, because $(L^*)^\varepsilon_\varepsilon L_w$ is 
always present. We have two cases: 

1. Suppose $L$ is rejecting. Then $L$ has $l$ accepting quotients. 

(a) If no accepting quotient of $L$ is included in the subset, then there are $2^{n-l} - 1$ such subsets 
possible, the union being non-empty because $L_w$ is always included. 

(b) If an accepting quotient of $L$ is included, then $\varepsilon \in (L^*)_w$, $(L^*)^\varepsilon_w = \varepsilon$, and $L = (L^*)^\varepsilon_w L_\varepsilon$ is 
also included. We have $2^l - 1$ non-empty subsets of accepting quotients of $L$ and $2^{n-l-1}$ 
subsets of rejecting quotients, since $L$ is not counted. 

Adding 1 for $(L^*)_\varepsilon$, we have a total of $2^{n-l} - 1 + (2^l - 1)2^{n-l-1} + 1 = 2^{n-1} + 2^{n-l-1}$. 

2. Suppose $L$ is accepting. Then $L$ has $l + 1$ accepting quotients. 

(a) If there is no accepting quotient, there are $2^{n-l-1} - 1$ non-empty subsets of rejecting quotients. 

(b) If an accepting quotient of $L$ is included, then $L$ is included, and $2^{n-1}$ subsets can be added 
to $L$. 

We need not add $(L^*)_\varepsilon$, since $\varepsilon \cup LL^* = LL^*$ in this case, and this has already been counted. The 
total is $2^{n-1} + 2^{n-l-1} - 1$. 

The worst-case bound of $2^{n-1} + 2^{n-l-1}$ occurs in the first case only. □ 
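
The counting in the proof can be replayed by brute force. The sketch below enumerates the candidate pairs used for the product (a quotient of $K$ together with a set of quotients of $L$, where an accepting quotient of $K$ forces $L$ itself into the set) and compares the count with the closed form; it also checks the arithmetic of the first star case. The function names and the encoding of subsets as bit masks are assumptions made for this illustration.

```python
def count_product_candidates(m, n, k):
    """Count pairs (quotient of K, subset of quotients of L) allowed by (13).

    Quotients of K are numbered 0..m-1, the first k being accepting.
    Subsets of the n quotients of L are bit masks; bit 0 stands for L itself.
    """
    count = 0
    for i in range(m):
        for mask in range(2 ** n):
            if i < k and not mask & 1:
                continue  # an accepting K_w forces L into the union
            count += 1
    return count

def product_bound(m, n, k):
    """Closed form of Theorem 2, part 3(c)."""
    return m * 2 ** n - k * 2 ** (n - 1)

def star_case_one(n, l):
    """Total obtained in case 1 of the star proof (L rejecting)."""
    return (2 ** (n - l) - 1) + (2 ** l - 1) * 2 ** (n - l - 1) + 1

def star_bound(n, l):
    """Closed form of Theorem 2, part 4(c)."""
    return 2 ** (n - 1) + 2 ** (n - l - 1)

print(count_product_candidates(7, 5, 2), product_bound(7, 5, 2))  # prints 192 192
print(star_case_one(5, 2), star_bound(5, 2))                      # prints 20 20
```

The enumeration only counts expressions of the permitted shape; it says nothing about whether the bound is reached, which is the subject of the next subsection.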



3.2 Witnesses to bounds for basic operations 

Finding witness languages showing that a bound is tight is often challenging. However, once a guess is 
made, the verification can be done using quotients. 

Let $|w|_a$ be the number of $a$'s in $w$, for $a \in \Sigma$ and $w \in \Sigma^*$. Unary, binary, and ternary languages are 
languages over a one-, two-, and three-letter alphabet, respectively. 

• Union and Intersection If we have a bound for intersection, then for union we can use the fact 
that $\kappa(\overline{K} \cup \overline{L}) = \kappa(\overline{\overline{K} \cup \overline{L}}) = \kappa(K \cap L)$; thus the pair $(\overline{K}, \overline{L})$ is a witness for union. Similarly, 
given a witness for union, we also have a witness for intersection. 

The upper bound $mn$ for the complexity of intersection was observed in 1957$^4$ by Rabin and 
Scott [26]. Binary languages 

$$K = \{w \in \{a, b\}^* \mid |w|_a \equiv m - 1 \bmod m\}$$

and 

$$L = \{w \in \{a, b\}^* \mid |w|_b \equiv n - 1 \bmod n\}$$

have quotient complexities $m$ and $n$, respectively. In 1970 Maslov [21] stated without proof that 
$K \cup L$ meets this upper bound $mn$. Yu, Zhuang and K. Salomaa [32] used similar languages 

$$K' = \{w \in \{a, b\}^* \mid |w|_a \equiv 0 \bmod m\}$$

and 

$$L' = \{w \in \{a, b\}^* \mid |w|_b \equiv 0 \bmod n\}$$

for intersection, apparently unaware of [21]. Hricko, Jiraskova and Szabari [10] showed that a 
complete hierarchy of quotient complexities of binary languages exists between the minimum 
complexity 1 and the maximum complexity $mn$. More specifically, it was proved that for any 
integers $m$, $n$, $\alpha$ such that $m \ge 2$, $n \ge 2$ and $1 \le \alpha \le mn$, there exist binary$^5$ languages $K$ and $L$ 
such that $\kappa(K) = m$, $\kappa(L) = n$, and $\kappa(K \cup L) = \alpha$, and the same holds for intersection. 

$^4$ The work was done in 1957, but published in 1959. 

$^5$ The proof in [10] is for ternary languages; a proof for the binary case can be found in [9]. 

For a one-letter alphabet $\Sigma = \{a\}$, Yu showed that the bound can be reached if $m$ and $n$ are 
relatively prime [30]. The witnesses are $K'' = (a^m)^*$ and $L'' = (a^n)^*$. For other cases, see the 
paper by Pighizzini and Shallit [25]. 

• Set difference For set difference we have $\kappa(K' \setminus \overline{L'}) = \kappa(K' \cap L')$; thus the pair $(K', \overline{L'})$ is a 
witness. 

• Symmetric difference For symmetric difference, let $m, n > 1$, let $K = (b^*a)^{m-1}(a \cup b)^*$ and 
let $L = (a^*b)^{n-1}(a \cup b)^*$. There are $mn$ words of the form $a^i b^j$, where $0 \le i \le m - 1$ and 
$0 \le j \le n - 1$. We claim that all the quotients of $K \oplus L$ by these words are distinct. Let $x = a^i b^j$ 
and $y = a^k b^l$. If $i < k$, let $u = a^{m-1-k} b^n$. Then $xu \notin K$, $yu \in K$, and $xu, yu \in L$, showing that 
$xu \in K \oplus L$, and $yu \notin K \oplus L$, i.e., that $(K \oplus L)_x \neq (K \oplus L)_y$. Similarly, if $j < l$, let $v = a^m b^{n-1-l}$. 
Then $xv \in K \oplus L$, but $yv \notin K \oplus L$. Therefore all the quotients of $K \oplus L$ by these $mn$ words are 
distinct. 

For a one-letter alphabet, the witnesses are K" and L" as in the case of union above. 

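• Other boolean functions There are six more two-variable boolean functions that depend on both 
variables: $\overline{K} \cup \overline{L} = \overline{K \cap L}$, $\overline{K} \cap \overline{L} = \overline{K \cup L}$, $\overline{K} \cup L = \overline{K \setminus L}$, $\overline{K} \cap L = L \setminus K$, $K \cup \overline{L} = \overline{L \setminus K}$, and 
$\overline{K \oplus L}$. The witnesses for these functions can be found using the four functions above. 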

• Product The upper bound of $m2^n - 2^{n-1}$ was given by Maslov in 1970 [21], and he stated without 
proof that it is tight for binary languages 

$$K = \{w \in \{a, b\}^* \mid |w|_a \equiv m - 1 \bmod m\}$$

and 

$$L = (a^*b)^{n-2}(a \cup b)(b \cup a(a \cup b))^*.$$

The bound was refined by Yu, Zhuang and K. Salomaa [32] to $m2^n - k2^{n-1}$, where $k$ is the 
number of accepting quotients of $K$. Jirasek, Jiraskova and Szabari [11] proved that, for any 
integers $m$, $n$, $k$ such that $m \ge 2$, $n \ge 2$ and $0 < k < m$, there exist binary languages $K$ and $L$ such 
that $\kappa(K) = m$, $\kappa(L) = n$, and $\kappa(KL) = m2^n - k2^{n-1}$. Furthermore, Jiraskova [13] proved that, 
for all $m$, $n$, and $\alpha$ such that either $n = 1$ and $1 \le \alpha \le m$, or $n \ge 2$ and $1 \le \alpha \le m2^n - 2^{n-1}$, there 
exist languages $K$ and $L$ with $\kappa(K) = m$ and $\kappa(L) = n$, defined over a growing alphabet, such 
that $\kappa(KL) = \alpha$. 

For a one-letter alphabet, $mn$ is a tight bound for product if $m$ and $n$ are relatively prime [32]. The 
witnesses are $K = (a^m)^* a^{m-1}$ and $L = (a^n)^* a^{n-1}$. See also [25]. 

• Star Maslov [21] stated$^6$ without proof that $\kappa(L^*) \le 2^{n-1} + 2^{n-2}$, and provided a binary language 
meeting this bound. Three cases were considered by Yu, Zhuang and K. Salomaa [32]: 

- $n = 1$. If $L = \emptyset$, then $\kappa(L) = 1$ and $\kappa(L^*) = 2$. If $L = \Sigma^*$, then $\kappa(L^*) = 1$. 

- $n = 2$. $L = \{w \in \{a, b\}^* \mid |w|_a \equiv 1 \bmod 2\}$ has $\kappa(L) = 2$, and $\kappa(L^*) = 3$. 

- $n > 2$. Let $\Sigma = \{a, b\}$. Then $L = (b \cup a\Sigma^{n-1})^* a\Sigma^{n-2}$ has $n$ quotients, one of which is 
accepting, and $\kappa(L^*) = 2^{n-1} + 2^{n-2}$. This example is different from Maslov's. 

Moreover, Jiraskova [12] proved that, for all integers $n$ and $\alpha$ with either $1 = n \le \alpha \le 2$, or 
$n \ge 2$ and $1 \le \alpha \le 2^{n-1} + 2^{n-2}$, there exists a language $L$ over a $2^n$-letter alphabet such that 
$\kappa(L) = n$ and $\kappa(L^*) = \alpha$. 

For a one-letter alphabet, $n^2 - 2n + 2$ is a tight bound for star [32]. The witness is $L'' = (a^n)^* a^{n-1}$. 
See also [25]. 
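
Several of the witnesses above are small enough to check by machine. The following sketch builds the obvious counting automata for the union and symmetric-difference witnesses, and a subset automaton for the star witness with $n > 2$, and then counts distinguishable reachable states. All function names are assumptions made for this illustration, and the DFA used for $(b \cup a\Sigma^{n-1})^* a\Sigma^{n-2}$ (states $0, \ldots, n-1$, with $b$ looping on state 0 and every other transition advancing by one modulo $n$) is our reading of that expression rather than something taken from [32].

```python
def quotient_complexity(alphabet, step, start, accepts):
    """Number of reachable, pairwise distinguishable states of a complete DFA."""
    reach, todo = {start}, [start]
    while todo:
        q = todo.pop()
        for a in alphabet:
            r = step(q, a)
            if r not in reach:
                reach.add(r)
                todo.append(r)
    block = {q: int(accepts(q)) for q in reach}
    while True:  # partition refinement until stable
        ids, refined = {}, {}
        for q in reach:
            sig = (block[q],) + tuple(block[step(q, a)] for a in alphabet)
            refined[q] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = refined

def union_witness(m, n):
    """kappa(K ∪ L) for K = {|w|_a ≡ m-1 mod m}, L = {|w|_b ≡ n-1 mod n}."""
    step = lambda q, c: ((q[0] + 1) % m, q[1]) if c == "a" else (q[0], (q[1] + 1) % n)
    return quotient_complexity("ab", step, (0, 0),
                               lambda q: q[0] == m - 1 or q[1] == n - 1)

def symdiff_witness(m, n):
    """kappa(K ⊕ L) for K = (b*a)^(m-1)(a∪b)*, L = (a*b)^(n-1)(a∪b)*."""
    step = lambda q, c: ((min(q[0] + 1, m - 1), q[1]) if c == "a"
                         else (q[0], min(q[1] + 1, n - 1)))
    return quotient_complexity("ab", step, (0, 0),
                               lambda q: (q[0] == m - 1) != (q[1] == n - 1))

def star_witness(n):
    """kappa(L*) for the assumed DFA of L = (b ∪ aΣ^(n-1))* aΣ^(n-2), n > 2."""
    def dfa(q, c):  # final state is n-1
        return 0 if (q == 0 and c == "b") else (q + 1) % n
    def step(state, c):  # subset construction for L*
        nxt = {dfa(q, c) for q in state[1]}
        if n - 1 in nxt:
            nxt.add(0)  # a word of L has just ended, so a new one may start
        return (False, frozenset(nxt))
    start = (True, frozenset({0}))  # the flag marks the new initial state, which accepts ε
    return quotient_complexity("ab", step, start,
                               lambda s: s[0] or n - 1 in s[1])

print(union_witness(3, 4), symdiff_witness(3, 4), star_witness(3))  # prints 12 12 6
```

For $m = 3$ and $n = 4$ both boolean witnesses reach $mn = 12$, and for $n = 3$ the star witness reaches $2^{n-1} + 2^{n-2} = 6$. Such a check confirms individual instances only; it is not a substitute for the general arguments.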

$^6$ The bound is incorrectly stated as $2^{n-1} + 2^{n-2} - 1$, but the example is correct. 
4 Generalization of "non-returning" state 

A quotient $L_w$ of a language $L$ is uniquely reachable if $L_x = L_w$ implies that $x = w$. If $L_{wa}$ is uniquely 
reachable for $a \in \Sigma$, then so is $L_w$. Thus, if $L$ has a uniquely reachable quotient, then $L$ itself is uniquely 
reachable by the empty word, i.e., the minimal automaton of $L$ is non-returning$^7$. Thus the set of 
uniquely reachable quotients of $L$ is a tree with root $L$, if it is non-empty. 
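
In a minimal DFA, where states correspond to quotients, the uniquely reachable states can be found directly from the tree property just described: the initial state must have no incoming transition, and every other uniquely reachable state has exactly one incoming transition, coming from a uniquely reachable state. The following is a minimal sketch under that reading; the function name `uniquely_reachable`, the dictionary encoding of the transition function, and the small example automaton are assumptions made for this illustration (the example is not one of the automata of Figure 1).

```python
def uniquely_reachable(alphabet, delta, start):
    """States of a complete DFA that are reached by exactly one word."""
    reach, todo = {start}, [start]
    while todo:
        q = todo.pop()
        for a in alphabet:
            r = delta[q, a]
            if r not in reach:
                reach.add(r)
                todo.append(r)
    # Incoming transitions, counted only from reachable states.
    incoming = {q: [] for q in reach}
    for q in reach:
        for a in alphabet:
            incoming[delta[q, a]].append((q, a))
    if incoming[start]:
        return set()  # the automaton returns to its initial state
    unique, changed = {start}, True
    while changed:
        changed = False
        for q in reach - unique:
            if len(incoming[q]) == 1 and incoming[q][0][0] in unique:
                unique.add(q)
                changed = True
    return unique

# Hypothetical example: states 1 and 2 are each entered by one letter from
# the initial state 0, and everything then falls into state 3.
delta = {(0, "a"): 1, (0, "b"): 2, (1, "a"): 3, (1, "b"): 3,
         (2, "a"): 3, (2, "b"): 3, (3, "a"): 3, (3, "b"): 3}
print(sorted(uniquely_reachable("ab", delta, 0)))  # prints [0, 1, 2]
```

If the input automaton is not minimal, the result describes its states rather than the quotients of the language, so it should be minimized first.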

We now apply the concept of uniquely reachable quotients to boolean operations and product. 

Theorem 3. Suppose $\kappa(K) = m$, $\kappa(L) = n$, $K$ and $L$ have $m_u$ and $n_u$ uniquely reachable quotients, 
respectively, and there are $r$ words $w_i$ such that both $K_{w_i}$ and $L_{w_i}$ are uniquely reachable. If $\circ$ is a 
boolean operator, then 

$$\kappa(K \circ L) \le mn - (\alpha + \beta + \gamma), \text{ where} \tag{15}$$

$$\alpha = r(m + n) - r(r + 1); \quad \beta = (m_u - r)(n - (r + 1)); \quad \gamma = (n_u - r)(m - m_u - 1). \tag{16}$$

If $K$ has $k$ accepting quotients, $t$ of which are uniquely reachable, and $s$ rejecting uniquely reachable 
quotients, then 

$$\kappa(KL) \le m2^n - k2^{n-1} - s(2^n - 1) - t(2^{n-1} - 1). \tag{17}$$

The following observation was stated for union and intersection of finite languages in [30]; we add 
the suffix-free case: 

Corollary 4. If $K$ and $L$ are non-empty and finite or suffix-free languages and $\kappa(K) = m > 1$, 
$\kappa(L) = n > 1$, then $\kappa(K \circ L) \le mn - (m + n - 2)$. 

The bound $mn - (m + n - 2)$ for union of suffix-free languages was shown to be tight for quinary 
languages by Han and Salomaa [6]. It is also tight for the binary languages $K = a((ba^*)^{m-3} b)^*(ba^*)^{m-3}$ 
and $L = a((a \cup b)^{n-3} b)^*(a \cup b)^{n-3}$, as shown recently by Jiraskova and Olejar [16]. 



Figure 1: Illustrating unique reachability. 

Example 5. The automaton of Fig. 1(a) accepting $K$ has $m = 7$ and four uniquely reachable states: 1, 
2, 3, and 4. The automaton of Fig. 1(b) accepting $L$ has $n = 5$ and three uniquely reachable states: 1, 2, 
and 5. In pairs (1,1) and (2,2) both states are reachable by the same word ($\varepsilon$ and $b$, respectively); hence 
$r = 2$. 



7 The term "non-returning" suggests that once a state is left it cannot be visited again. However, such non-returning states 
are not necessarily uniquely reachable. 
The $m \times n = 7 \times 5$ table of all pairs is shown below, where uniquely reachable states are in boldface 
type. We have $\alpha = 18$, where the removed pairs are all the pairs in the first two rows and columns, except 
(1,1) and (2,2). Next, $\beta = 4$, and we remove the pairs (3,4), (3,5), (4,3) and (4,5) from rows 3 and 4. 
Finally, $\gamma = 2$, and we remove the pairs (6,5) and (7,5) from column 5. 

(1,1) (1,2) (1,3) (1,4) (1,5) 
(2,1) (2,2) (2,3) (2,4) (2,5) 
(3,1) (3,2) (3,3) (3,4) (3,5) 
(4,1) (4,2) (4,3) (4,4) (4,5) 
(5,1) (5,2) (5,3) (5,4) (5,5) 
(6,1) (6,2) (6,3) (6,4) (6,5) 
(7,1) (7,2) (7,3) (7,4) (7,5) 

Altogether, we have removed 24 states from $K \circ L$, leaving 11 possibilities. The minimal automaton of 
$K \cup L$ has 8 states. Notice that state 7 corresponds to the quotient $\Sigma^*$. Since $\Sigma^* \cup L_w = \Sigma^*$ for all $w$, we 
need to account for only one pair $(7, x)$, and we could remove the remaining four pairs. However, we 
have already removed pair (7,5) by Theorem 3. Hence, there are only three pairs left to remove, and we 
have an automaton with 8 states. More will be said about the effects of $\Sigma^*$ later. 

It is also possible to use Theorem 3 if $K$ has some uniquely reachable quotients and $L$ has none, or 
when $L$ is completely unknown. If $n_u = 0$, then $r = 0$, $\alpha = 0$, $\beta = m_u(n - 1)$, and $\gamma = 0$. Then, for 
any $L$, 

$$\kappa(K \circ L) \le mn - m_u(n - 1). \tag{18}$$

For example, for any $L$ with $n = 101$ and $K$ as in Fig. 1(a), $\kappa(K \cap L) \le 307$, instead of the general 
bound 707. 

Let $K$ and $L$ be the languages accepted by the automata of Fig. 1(a) and (b), respectively. Then the general bound on $\kappa(KL)$ 
is 192. Here $s = 3$ (states 1, 2, and 4), and $t = 1$ (state 3). By Theorem 3 the bound is reduced by 
$93 + 15 = 108$ to 84. The actual quotient complexity of $KL$ is 14. 

The general bound for $LK$ is 512, the reduced bound is 195, and the actual quotient complexity 
is 12. 
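
The numbers in Example 5 follow from formulas (15) to (18) by direct substitution, as the short sketch below shows. The function names are assumptions made for this illustration. Figure 1 is not reproduced here, so the values $k = 2$ for both automata, and $s = 2$, $t = 1$ for the automaton of Fig. 1(b), are not read off the figure; they are inferred from the stated totals 192, 512 and 195 and should be treated as assumptions.

```python
def boolean_bound(m, n, m_u, n_u, r):
    """Right-hand side of (15), with alpha, beta and gamma as in (16)."""
    alpha = r * (m + n) - r * (r + 1)
    beta = (m_u - r) * (n - (r + 1))
    gamma = (n_u - r) * (m - m_u - 1)
    return m * n - (alpha + beta + gamma)

def product_bound(m, n, k, s=0, t=0):
    """Right-hand side of (17); s = t = 0 gives the general product bound."""
    return m * 2 ** n - k * 2 ** (n - 1) - s * (2 ** n - 1) - t * (2 ** (n - 1) - 1)

# Boolean operations in Example 5: m = 7, n = 5, m_u = 4, n_u = 3, r = 2.
print(boolean_bound(7, 5, 4, 3, 2))    # prints 11

# Formula (18): n_u = 0 forces r = 0; here n = 101 and m_u = 4.
print(boolean_bound(7, 101, 4, 0, 0))  # prints 307

# Product KL: general bound, then the reduction with s = 3 and t = 1.
print(product_bound(7, 5, 2), product_bound(7, 5, 2, s=3, t=1))  # prints 192 84

# Product LK (inferred parameters, see above).
print(product_bound(5, 7, 2), product_bound(5, 7, 2, s=2, t=1))  # prints 512 195
```

The gap between these bounds and the actual complexities quoted in the example (8, 14 and 12) shows that the reductions of Theorem 3 are upper bounds only.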



5 Languages with $\varepsilon$, $\Sigma^+$, $\emptyset$, or $\Sigma^*$ as quotients 

In this section we consider the effects of the presence of special quotients in a language. In particular, 
we study the quotients $\varepsilon$, $\Sigma^+$, $\emptyset$, and $\Sigma^*$. 

Theorem 6. If $\kappa(K) = m$, $\kappa(L) = n$, and $K$ and $L$ have $k > 0$ and $l > 0$ accepting quotients, respectively, 
then 

1. If $K$ and $L$ have $\varepsilon$ as a quotient, then 

• $\kappa(K \cup L) \le mn - 2$. 

• $\kappa(K \cap L) \le mn - (2m + 2n - 6)$. 

• $\kappa(K \setminus L) \le mn - (m + 2n - k - 3)$. 

• $\kappa(K \oplus L) \le mn - 2$. 

2. If $K$ and $L$ have $\Sigma^+$ as a quotient, then 

• $\kappa(K \cap L) \le mn - 2$. 

• $\kappa(K \cup L) \le mn - (2m + 2n - 6)$. 

• $\kappa(K \setminus L) \le mn - (2m + l - 3)$. 

• $\kappa(K \oplus L) \le mn - 2$. 

3. If $K$ and $L$ have $\emptyset$ as a quotient, then 

• $\kappa(K \cap L) \le mn - (m + n - 2)$. 

• $\kappa(K \setminus L) \le mn - n + 1$. 

4. If $K$ and $L$ have $\Sigma^*$ as a quotient, then 

• $\kappa(K \cup L) \le mn - (m + n - 2)$. 

• $\kappa(K \setminus L) \le mn - m + 1$. 

5. • If $L$ has $\varepsilon$ as a quotient, then $\kappa(L^R) \le 2^{n-2} + 1$. 

• If $L$ has $\Sigma^+$ as a quotient, then $\kappa(L^R) \le 2^{n-2} + 1$. 

• If $L$ has $\emptyset$ as a quotient, then $\kappa(L^R) \le 2^{n-1}$. 

• If $L$ has $\Sigma^*$ as a quotient, then $\kappa(L^R) \le 2^{n-1}$. 

• Moreover, the effect of these quotients on complexity is cumulative. For example, if $L$ has 
both $\emptyset$ and $\Sigma^*$, then $\kappa(L^R) \le 2^{n-2}$, if $L$ has both $\emptyset$ and $\Sigma^+$, then $\kappa(L^R) \le 2^{n-3} + 1$, etc. 

Corollary 7. If $K$ and $L$ are both non-empty and both suffix-free with $\kappa(K) = m$ and $\kappa(L) = n$, then 

$$\kappa(K \cap L) \le mn - 2(m + n - 3).$$

It is shown in [6] that the bound can be reached with 

$$K = \{\#w \mid w \in \{a, b\}^*,\ |w|_a \equiv 0 \bmod m - 2\},$$

$$L = \{\#w \mid w \in \{a, b\}^*,\ |w|_b \equiv 0 \bmod n - 2\}.$$

It was recently proved in [16] that this bound can be reached by the binary languages given after 
Corollary 4. 
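
The ternary witnesses of Corollary 7 can be checked for small $m$ and $n$ by building the product automaton for $K \cap L$ and counting its distinguishable states. The sketch below assumes $m, n \ge 3$; the function names and the state encoding (an initial state, a sink, and pairs of counters modulo $m - 2$ and $n - 2$) are assumptions made for this illustration.

```python
def quotient_complexity(alphabet, step, start, accepts):
    """Number of reachable, pairwise distinguishable states of a complete DFA."""
    reach, todo = {start}, [start]
    while todo:
        q = todo.pop()
        for a in alphabet:
            r = step(q, a)
            if r not in reach:
                reach.add(r)
                todo.append(r)
    block = {q: int(accepts(q)) for q in reach}
    while True:  # partition refinement until stable
        ids, refined = {}, {}
        for q in reach:
            sig = (block[q],) + tuple(block[step(q, a)] for a in alphabet)
            refined[q] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = refined

def suffix_free_intersection(m, n):
    """kappa(K ∩ L) for K = {#w : |w|_a ≡ 0 mod m-2}, L = {#w : |w|_b ≡ 0 mod n-2}."""
    def step(q, c):
        if q == "init":
            return (0, 0) if c == "#" else "sink"
        if q == "sink" or c == "#":
            return "sink"  # a second # can never be completed to a word
        i, j = q
        return ((i + 1) % (m - 2), j) if c == "a" else (i, (j + 1) % (n - 2))
    return quotient_complexity("#ab", step, "init", lambda q: q == (0, 0))

m, n = 5, 6
print(suffix_free_intersection(m, n), m * n - 2 * (m + n - 3))  # prints 14 14
```

The count is $(m - 2)(n - 2) + 2$, one state for each pair of counters plus the initial state and the sink, which equals $mn - 2(m + n - 3)$.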

Proposition 8. If $\kappa(L) = n \ge 3$, $L$ has $l > 0$ accepting quotients, and $L$ has $\varepsilon$ as a quotient, then 
$\kappa(L^*) \le 2^{n-3} + 2^{n-l-1} + 1$. 

Proof: If $L$ has $\varepsilon$, then it also has $\emptyset$. From (14), every quotient of $L^*$ by a non-empty word is a union of 
a non-empty subset of quotients of $L$, followed by $L^*$. We have two cases: 

1. Suppose $L$ is rejecting. 

(a) If no accepting quotient is included, then there are $2^{n-l-1} - 1$ non-empty subsets of non-empty 
rejecting quotients plus the subset consisting of the empty quotient alone, for a total 
of $2^{n-l-1}$. 

(b) If an accepting quotient is included in the subset, then so is $L$. We can add the subset $\{\varepsilon\}$ 
or any non-empty subset $S$ of accepting quotients that does not contain $\varepsilon$, since $S \cup \{\varepsilon\}$ 
is equivalent to $S$. Thus we have $2^{l-1}$ subsets of accepting quotients. To this we can add 
$2^{n-l-2}$ rejecting subsets, since the empty quotient and $L$ need not be counted. The total is 
$2^{l-1} 2^{n-l-2} = 2^{n-3}$. 

Adding 1 for $(L^*)_\varepsilon$, we have a total of $2^{n-3} + 2^{n-l-1} + 1$. 

2. Suppose $L$ is accepting. Since $n \ge 3$, we have $L \neq \varepsilon$. 

(a) If there is no accepting quotient, there are $2^{n-l-1}$ subsets, as before. 

(b) If an accepting quotient is included, then $L$ is included and $L$ itself is sufficient to guarantee 
that $(L^*)_w$ is accepting. Since $L \cup \varepsilon = L \cup \emptyset = L$, we also exclude $\varepsilon$ and $\emptyset$. Thus any one of 
the $2^{n-3}$ subsets of the remaining quotients can be added to $L$. 

The total is $2^{n-3} + 2^{n-l-1}$. We need not add $(L^*)_\varepsilon$, since it is $LL^*$ which has been counted already. 

The worst-case bound of $2^{n-3} + 2^{n-l-1} + 1$ occurs in the first case only. □ 
6 Conclusions 

Quotients provide a uniform approach for finding upper bounds for the complexity of operations on 
regular languages, and for verifying that particular languages meet these bounds. It is hoped that this is 
a step towards a theory of complexity of languages and automata. 